31 research outputs found
New Algorithms for Position Heaps
We present several results about position heaps, a relatively new alternative
to suffix trees and suffix arrays. First, we show that, if we limit the maximum
length of patterns to be sought, then we can also limit the height of the heap
and reduce the worst-case cost of insertions and deletions. Second, we show how
to build a position heap in linear time independent of the size of the
alphabet. Third, we show how to augment a position heap such that it supports
access to the corresponding suffix array, and vice versa. Fourth, we introduce
a variant of a position heap that can be simulated efficiently by a compressed
suffix array with a linear number of extra bits
Standardized next-generation sequencing of immunoglobulin and T-cell receptor gene recombinations for MRD marker identification in acute lymphoblastic leukaemia; a EuroClonality-NGS validation study
Amplicon-based next-generation sequencing (NGS) of immunoglobulin (IG) and T-cell receptor (TR) gene rearrangements
for clonality assessment, marker identification and quantification of minimal residual disease (MRD) in lymphoid neoplasms
has been the focus of intense research, development and application. However, standardization and validation in a
scientifically controlled multicentre setting is still lacking. Therefore, IG/TR assay development and design, including
bioinformatics, was performed within the EuroClonality-NGS working group and validated for MRD marker identification
in acute lymphoblastic leukaemia (ALL). Five EuroMRD ALL reference laboratories performed IG/TR NGS in 50
diagnostic ALL samples, and compared results with those generated through routine IG/TR Sanger sequencing. A central
polytarget quality control (cPT-QC) was used to monitor primer performance, and a central in-tube quality control (cIT-QC)
wa
On the Number of Elements to Reorder When Updating a Suffix Array
Recently new algorithms appeared for updating the Burrows-Wheeler transform or the suffix array, when the text they index is modified. These algorithms proceed by reordering entries and the number of such reordered entries may be as high as the length of the text. However, in practice, these algorithms are faster for updating the Burrows-Wheeler transform or the suffix array than the fastest reconstruction algorithms. In this article we focus on the number of elements to be reordered for real-life texts. We show that this number is related to LCP values and that, on average, Lave entries are reordered, where Lave denotes the average LCP value, defined as the average length of the longest common prefix between two consecutive sorted suffixes. Since we know little about the LCP distribution for real-life texts, we conduct experiments on a corpus that consists of DNA sequences and natural language texts. The results show that apart from texts containing large repetitions, the average LCP value is close to the one expected on a random text
Average Complexity for Updating a Suffix Array
International audienc